Untargeted metabolomics reveal signatures of a healthy lifestyle

This cross-sectional study investigated differences in the plasma metabolome in two groups of adults that were of similar age but varied markedly in body composition and dietary and physical activity patterns. Study participants included 52 adults in the lifestyle group (LIFE) (28 males, 24 females) and 52 in the control group (CON) (27 males, 25 females). The results using an extensive untargeted ultra high-performance liquid chromatography-high resolution mass spectrometry (UHPLC-HRMS) metabolomics analysis with 10,535 metabolite peaks identified 486 important metabolites (variable influence on projections scores of VIP ≥ 1) and 16 significantly enriched metabolic pathways that differentiated LIFE and CON groups. A novel metabolite signature of positive lifestyle habits emerged from this analysis highlighted by lower plasma levels of numerous bile acids, an amino acid profile characterized by higher histidine and lower glutamic acid, glutamine, β-alanine, phenylalanine, tyrosine, and proline, an elevated vitamin D status, higher levels of beneficial fatty acids and gut microbiome catabolism metabolites from plant substrates, and reduced levels of N-glycan degradation metabolites and environmental contaminants. This study established that the plasma metabolome is strongly associated with body composition and lifestyle habits. The robust lifestyle metabolite signature identified in this study is consistent with an improved life expectancy and a reduced risk for chronic disease.


Study design and methods
This study employed a cross-sectional design that compared metabolite profiles in adults adhering (n = 52) or not adhering (n = 52) to lifestyle recommendations. Questionnaires provided information about their lifestyle and physical activity patterns, mood states, estimated VO2max, nutrient intake, and medical history 30.
Participants reported to the lab at the scheduled appointment time in an overnight fasted state (i.e., no food, supplements, or beverages other than water for at least the previous 8 h). After 10-15 min of seated rest, resting heart rate (RHR) and blood pressure were measured. A 35 ml blood sample was collected from an arm vein. Venous blood samples were collected in ethylenediaminetetraacetic acid (EDTA)-containing blood collection tubes. Plasma aliquots were prepared from the EDTA-containing blood collection tubes and stored in a −80 °C freezer until analysis for metabolomics. Participants were taken into the performance lab for measurements of height, weight, waist circumference, sagittal abdominal diameter, leg/back and hand grip dynamometer strength, and body fat (bioelectrical impedance or BIA) 30.

Untargeted metabolomics data capture and preprocessing
Details of the sample preparation, data acquisition, data preprocessing and metabolite identification and annotation [31][32][33] are provided in the Supplementary Methods Material. Briefly, untargeted metabolomics data of randomized plasma samples (interspersed with 10% blanks, quality control study pools (QCSP), and NIST SRM 1950 plasma reference material) was acquired in positive mode on a Vanquish UHPLC system coupled with a Q Exactive™ HF-X Hybrid Quadrupole-Orbitrap™ Mass Spectrometer (UHPLC-HRMS; Thermo Fisher Scientific, San Jose, CA). Raw files for all study samples, QCSP, blank, and NIST reference material runs were uploaded to Progenesis QI (Waters Corporation, Milford, MA) for alignment and peak picking. Data was normalized 34,35 to a reference QCSP sample using the "normalize to all" function in Progenesis QI 36. Peaks detected by UHPLC-HRMS were identified or annotated using ADAP-KDB software 37. The evidence basis for metabolite identifications and annotations [31][32][33] to the in-house physical standards library (Ontology Level, OL), or Public Databases (PD), are described in the Supplementary Methods Material. It should be noted that metabolomics platforms cannot always distinguish between isomers and that multiple peaks may match the same compound. Additionally, one metabolic peak may match to multiple metabolites due to adduct formation or isobaric compounds.

Multivariate data analysis, and univariate statistics
Multivariate analysis was performed on the normalized UHPLC-HRMS data using SIMCA 17.0 (Sartorius Stedim Data Analytics AB, Umeå, Sweden) to reduce the dimensionality and to enable visualization of the differentiation of the phenotypic groups 38,39. Unsupervised multivariate analysis models were created using principal component analysis (PCA), and the scores plots were inspected to ensure that the QCSP samples were tightly clustered and in the center of the study samples from which they were derived, a quality control method that is widely used in metabolomic studies 40. Orthogonal partial least squares discriminant analysis (OPLS-DA) was used to determine the variable influence on projection (VIP) for the preprocessed UHPLC-HRMS data, to define the signals deemed important for differentiating the phenotypic groups. Signals with VIP ≥ 1.0 and a jack-knife confidence interval that did not include 0 were selected as important. The VIP statistic summarizes the importance of the signal in differentiating the phenotypic groups 39. All models used sevenfold cross-validation to assess the predictive ability (Q2) of the model. Additional statistical analyses were conducted using SAS 9.4 (SAS Institute Inc., Cary, NC) and included a two-sided t-test with the Satterthwaite correction for unequal variances. Nominal p-values are reported for the comparison between the lifestyle and control groups because this exploratory analysis was not powered for a specific hypothesis [41][42][43]. Metabolite peaks that met VIP ≥ 1, p < 0.10, or fold group differences ≥ │2.0│ were reported for the differentiation of the phenotypic groups in the metabolomics analysis. This discovery study did not use FDR correction because the study was not powered for a specific hypothesis [41][42][43]. This study determined linear combinations of metabolites with VIP scores ≥ 1 that are important for determining differences between the lifestyle and control groups. P-values were reported in all cases. While some of these p-values may not be significant after FDR correction, these metabolites are still important to the signature that differentiates the lifestyle and control groups.
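The univariate step described above can be illustrated with a minimal Python sketch. It is not the SAS code used in the study. The function names (welch_t, signed_fold, is_reported) are hypothetical, and the signed fold-difference formula is an assumption based on the convention that a positive value means the LIFE mean exceeded the CON mean. The sketch returns the t statistic and the Satterthwaite degrees of freedom only; converting these to a p-value requires a t-distribution function from a statistics package.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for a two-sample t-test that allows unequal variances."""
    na, nb = len(a), len(b)
    va = variance(a) / na  # squared standard error of the mean, group a
    vb = variance(b) / nb  # squared standard error of the mean, group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Satterthwaite approximation to the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


def signed_fold(life_mean, con_mean):
    """Signed fold difference (assumed convention): positive when LIFE > CON.

    Both means must be positive, as is the case for peak intensities.
    """
    if life_mean >= con_mean:
        return life_mean / con_mean
    return -con_mean / life_mean


def is_reported(vip, p, fold):
    """Reporting rule as read from the text: VIP >= 1, p < 0.10, or |fold| >= 2."""
    return vip >= 1.0 or p < 0.10 or abs(fold) >= 2.0


# Toy intensities for one hypothetical peak (not study data)
life = [1.0, 2.0, 3.0, 4.0]
con = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, dof = welch_t(life, con)
fold = signed_fold(mean(life), mean(con))
```

Because the three criteria are joined by "or", a peak with a modest VIP can still be reported when its fold difference is large, which is why the reported list is much longer than the list of VIP ≥ 1 peaks with library matches.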

Lasso modeling
Lasso regression was used to consider another statistical model that reduces errors caused by overfitting. Group (LIFE, CON) discriminators based on the entire metabolomics dataset were constructed using penalized logistic regression. This analysis was conducted using the R package "glmnet" (cran.r-project.org/web/packages/glmnet/) 44,45 with the alpha parameter set to 1.0. This is equivalent to "Lasso" regression wherein the number of predictor variables in the categorization model is minimized. The normalized intensities for all metabolite peaks across 104 subjects were input into the algorithm with group status (LIFE, CON) as the binary category to be predicted. As it is well known that even penalized regression techniques will over-fit the training data when the number of predictor variables is much larger than the number of samples, a leave-one-out (LOO) protocol was used to get an estimate of how well a discriminator trained on this data might perform on new samples. The LOO approach consists of iterating over all N samples in the dataset. At each step one of the samples is withheld while a model is optimized over the other N-1 samples and a prediction is made for the sample that was held out. We computed receiver-operating characteristic (ROC) curves from the LIFE and CON group predictions obtained with a simple Lasso regression model. To compare ROC curves, we computed the area under the curve (AUC) and a p-value for obtaining such an AUC at random. These calculations were performed using the R package "pROC" (https://cran.r-project.org/web/packages/pROC/index.html).
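The leave-one-out protocol and the AUC calculation can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the glmnet/pROC workflow used in the study: a nearest-centroid scorer stands in for the Lasso-penalized logistic regression, all function names are hypothetical, and the p-value uses the normal approximation to the Mann-Whitney U statistic, which is one common way to test an AUC against chance and may differ from what pROC reports.

```python
import math


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_p_value(auc_value, n_pos, n_neg):
    """One-sided p-value for AUC > 0.5 (normal approximation, no tie or continuity correction)."""
    u = auc_value * n_pos * n_neg
    mu = n_pos * n_neg / 2.0
    sigma = math.sqrt(n_pos * n_neg * (n_pos + n_neg + 1) / 12.0)
    z = (u - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def fit_centroids(X, y):
    """Stand-in model (NOT Lasso): the per-class mean of every feature."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, yy in zip(X, y) if yy == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def score(centroids, x):
    """Higher score means the sample is closer to the class-1 centroid."""
    d0 = sum((a - b) ** 2 for a, b in zip(x, centroids[0]))
    d1 = sum((a - b) ** 2 for a, b in zip(x, centroids[1]))
    return d0 - d1


def loo_scores(X, y, fit=fit_centroids, predict=score):
    """Hold out each sample once, fit on the other N-1, and score the held-out sample."""
    out = []
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        out.append(predict(fit(X_train, y_train), X[i]))
    return out


# Toy data: six samples, two features, label 1 = LIFE, 0 = CON (not study data)
X = [[0.0, 1.0], [0.2, 0.9], [0.1, 1.1], [1.0, 0.0], [0.9, 0.2], [1.1, 0.1]]
y = [0, 0, 0, 1, 1, 1]
held_out = loo_scores(X, y)
loo_auc = auc(held_out, y)
loo_p = auc_p_value(loo_auc, n_pos=3, n_neg=3)
```

The key property is that the model scoring sample i never saw sample i during fitting, so the AUC computed from the held-out scores is an estimate of out-of-sample performance, whereas an AUC computed from a single model fit to all samples is optimistic.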

Pathway enrichment and biological interpretation
Pathway enrichment analysis was conducted using the Mummichog algorithm 46 in the Functional Analysis module of MetaboAnalyst 5.0 [47][48][49]. The 10,535 features that remained after data preprocessing were entered in positive mode together with their mass-to-charge ratio (m/z), retention time, p-value, and fold change for the comparison of lifestyle versus controls for all subjects. A p-value cut-off of 0.05 was used for the size of the permutation group that the algorithm used for selecting significant features for metabolite matching. A 3 ppm mass tolerance was used for mass accuracy for annotating peaks to metabolites and identifying candidate pathways. All possible metabolites that were matched by m/z were searched in the Homo sapiens (human) [MFN] pathway library. The experimental list of metabolites was compared to a null distribution of randomly generated m/z features from the reference library to determine pathway significance 46. Significance was reported as uncorrected p-values. In addition to the pathway analysis using MetaboAnalyst, biochemical pathway interpretation was conducted with a classical approach of assessing the connection between analytes that met the criteria for being most important (VIP ≥ 1 and p < 0.05 for group fold differences ≥ │1.8│) between LIFE and CON groups. Some metabolites are represented in more than one metabolic pathway.
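The 3 ppm mass tolerance can be illustrated with a short Python sketch. The library entries and m/z values below are hypothetical placeholders, not compounds from the study, and the function names are illustrative; MetaboAnalyst performs this matching internally and also accounts for adducts.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Mass error in parts per million relative to the theoretical m/z."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def match_candidates(observed_mz, library, tol_ppm=3.0):
    """Names of library entries whose theoretical m/z lies within tol_ppm of the peak."""
    return [
        name
        for name, mz in library.items()
        if abs(ppm_error(observed_mz, mz)) <= tol_ppm
    ]


# Hypothetical reference library: name -> theoretical m/z of the ion
library = {"compound_A": 156.07675, "compound_B": 156.07800}

# A peak observed at m/z 156.07700 is about 1.6 ppm from compound_A
# and about 6.4 ppm from compound_B, so only compound_A is retained.
hits = match_candidates(156.07700, library)
```

A tight tolerance narrows the candidate list but does not resolve isomers or isobaric compounds, which share the same m/z within the tolerance, so one peak can still match several metabolites.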

Subject characteristics
This study employed a cross-sectional design and compared 52 subjects in the lifestyle group (LIFE) and 52 in the control group (CON) (Table 1). The sex distribution was comparable between groups (LIFE: 28 males, 24 females; CON: 27 males, 25 females; χ² = 0.039, p = 0.844). Analyses were conducted for all study participants combined. Age, education level, and height did not differ significantly between LIFE and CON groups (Table 1). Several measures of body composition differed between LIFE and CON groups (Table 1). These included the body mass index (BMI), fat mass index (FMI), body fat percentage, and sagittal abdominal diameter (SAD). Estimated aerobic capacity (VO2max) and total physical activity calculated as MET-min/week differed between the LIFE and CON groups (p < 0.001) (Table 1). Fruit and vegetable intake was higher and red meat intake lower

Multivariate and univariate statistics
The supervised OPLS-DA for plasma samples from the LIFE and CON groups (Fig. 1) showed strong model statistics for outcome (R2Y = 0.959) and prediction (Q2 = 0.523, sevenfold cross validation). Over 5300 signals met the criteria of VIP ≥ 1 or p < 0.10 or a fold change ≥ │2│ (Supplementary Material, Table S3). Over 3200 signals had p < 0.10, and over 2400 signals had p < 0.05 for comparison between LIFE and CON. Over 1200 signals had p < 0.01 for comparisons between LIFE and CON groups. A total of 486 important metabolite peaks (VIP ≥ 1) were matched to metabolites in the in-house physical standards library using ADAP-KDB software (Supplementary Material, Table S4). The most important library-matched metabolite peaks (VIP ≥ 1 and p ≤ 0.05 or fold group differences ≥ │1.8│) are shown in Table 2.

Lasso modeling
Receiver-operator-characteristic (ROC) curves from LOO and the single over-fit models are shown in Fig. 2. The area under the curve (AUC) for the single model was 1.0 (p-value 7.7e−19) and for the LOO models was 0.96 (p-value 5.6e−16). The Lasso modeling approach identified 55 metabolite peaks, all of which were also among the important metabolite peaks listed in Table S3. The summary can be found in Supplementary Table S5.
LOO cross validation (LOOCV) predictions using all metabolomics data points are summarized in Fig. 3 for key lifestyle traits. Older age was strongly related to the metabolomics data with no differences between the LIFE and CON groups (r = 0.80, p-value = 5e−24). LIFE and CON group membership was strongly predicted using the plasma metabolomics data for three different body composition outcomes including BMI (r = 0.84, p-value = 3e−29), percent body fat (r = 0.80, p-value = 7e−24), and the sagittal abdominal diameter (SAD) (r = 0.82, p-value = 6e−27), and moderately predicted for the average number of daily servings of fruits and vegetables combined (r = 0.66, p-value = 3e−14), and the days per week for moderate-to-vigorous physical activity (MVPA) (r = 0.68, p-value = 4e−15).
The complete list of the pathways is shown in Supplementary Table S6 for the comparison of LIFE and CON groups. Signals associated with these enriched pathways, that were statistically significant between LIFE and CON groups, and that were identified or annotated using our in-house physical standards library are described in Table S7.

Discussion
This cross-sectional study focused on the metabotype profile associated with a healthy lifestyle. The LIFE and CON groups were of similar age, education, and sex distribution, but differed significantly in body composition and exercise and dietary patterns. The proteomics dataset previously published from this cross-sectional study showed strong group differences for 39 proteins supporting a lower innate immune activation signature and greater lipoprotein metabolism and HDL remodeling in the LIFE group 30. In this analysis, untargeted metabolomics of more than 10,000 metabolite peaks revealed a distinct difference in the plasma metabolome between the LIFE and CON groups. Multivariate LOO modeling confirmed that group status (LIFE vs. CON) was strongly predicted by the metabolite signature and exceeded the prediction model from the proteomics data 30. Numerous metabolites were identified/annotated that most significantly differentiated LIFE and CON groups. An enriched pathway analysis using Mummichog indicated group differences for 16 metabolic pathways highlighted by contrasts in bile acid and amino acid metabolism.
The reduced plasma bile acid signature in the LIFE vs. CON group is a novel and important finding from this cross-sectional study. Plasma hydroxycholesterol, a cholesterol precursor in primary bile acid metabolism, and more than 10 primary and secondary bile acids were significantly lower in the LIFE versus CON groups. Other studies indicate that plasma bile acid concentrations vary widely between individuals and that this variance is due to lifestyle, gut microbial, and genetic factors 50. Normally enterohepatic circulation of bile acids is very efficient and only a small proportion of bile acids escape into the systemic circulation 50. Circulating bile acids at normal low concentrations have regulatory functions and exert signaling functions in peripheral tissues and organs through specific nuclear receptors including the farnesoid X receptor (FXR) and the Takeda G protein-coupled receptor 1 (TGR5) 50,51. Emerging data indicate that individuals with obesity and various diseases including type 2 diabetes mellitus have elevated plasma bile acid concentrations in the fasted state 48,49. One study showed that even in young and relatively healthy adults, plasma bile acid levels were associated with cardiometabolic and inflammatory disease risk biomarkers 52. A 14-week exercise and weight loss intervention study demonstrated that total fasting bile acids decreased by 30% accompanied by a 55% increase in serum levels of the rate-limiting enzyme cholesterol 7 alpha-hydroxylase (CYP7A1) 53. Limited data suggest that aerobic capacity influences bile acid metabolism 54 and that intake of dietary fiber and polyphenols from whole plant foods has a significant effect on the gut microbiome and bile acid metabolism and related signaling pathways 55,56. Additional human systems biology-based studies using a variety of multiomics approaches will broaden current understandings regarding the specific and combined lifestyle relationships of body composition and dietary and exercise patterns on bile acid metabolism 50.
Table 2 notes. See Supplementary Table S4 for additional important metabolite peaks with statistics. a Mass spectrometry metabolomics platforms cannot always distinguish between isomers, and multiple peaks may match the same compound. Additionally, one peak may match multiple metabolites due to adduct formation or isobaric compounds. For the complete list of metabolites annotations or identifications of metabolite peaks see Supplementary Tables S1 and S2. b t-test with Satterthwaite correction for unequal variances. c Positive fold difference indicates that the mean value for the LIFE group was greater than for the CON group.
LIFE versus CON group differences were found for seven of 20 standard amino acids, with higher histidine and lower glutamic acid, glutamine, β-alanine, phenylalanine, tyrosine, and proline. This LIFE-related amino acid signature was spread across seven different metabolic pathways including histidine, lysine, pyrimidine, amino sugars, β-alanine, tyrosine, and butanoate metabolism. Some aspects of this LIFE versus obese-CON-related amino acid signature have been reported by others, but the literature is far from consistent 18,26. There is agreement that amino acid metabolism is extensively altered in various disease states and influenced by body composition and lifestyle habits 57. In a cross-sectional study with obese and non-obese women, serum amino acids including histidine, arginine, threonine, glycine, lysine, and serine were found to be significantly lower in obese women as compared to non-obese controls, similar to our results 58. In our study, the most important LIFE versus CON contrast was for glutamic acid, an acidic, non-essential amino acid that is involved in numerous metabolic pathways. The plasma concentration of glutamic acid is inversely related to visceral adipose tissue and may be influenced by obesity-induced changes in the gut microbiota 59. Pathway enrichment for LIFE versus CON identified the histidine metabolic pathway as most affected with higher levels of histidine, 4-imidazoleacetic acid, and l-formiminoglutamic acid and lower levels of glutamic acid in the LIFE group. Histidine is an essential amino acid and has been positively associated with insulin sensitivity, obesity, liver and kidney disease, and heart failure, and inversely related to inflammation and oxidative stress 58,60. The gut microbiome appears to play a key role in regulating dietary histidine bioavailability 61. Plasma levels of branched chain amino acids (BCAAs) did not differ between LIFE and CON groups in contrast to other studies that have noted elevated plasma BCAA levels in obese groups 18,26. The literature is mixed, however, regarding plasma BCAA levels and associations with adiposity, longevity, sarcopenia, and diabetes 62.
5-hydroxylysine was increased and other lysine metabolites were decreased in LIFE versus CON groups. Limited data indicate that obesity may be related to enhanced lysine degradation via the saccharopine pathway 63. Lysine is subjected to diverse enzyme-catalyzed post-translational modifications (PTMs), including methylation, acetylation, crotonylation, ubiquitination, and SUMOylation. Acetyllysine (or acetylated lysine) is an acetyl derivative of the amino acid lysine. In proteins, the acetylation of lysine residues is an important mechanism of epigenetics. Free trimethyllysine (TML) is involved in the carnitine biosynthesis pathway, where it acts as the first intermediate in a series of four enzymatic reactions to generate l-carnitine 64. TML is an important post-translationally modified amino acid with functions in carnitine biosynthesis and regulation of key epigenetic processes. The dataset from this cross-sectional study supports lower levels of lysine degradation in the LIFE group, and the clinical significance of this finding remains to be determined. In contrast, pipecolic acid, an l-alpha amino acid metabolite product of lysine microbiome catabolism and a marker of dry bean intake 65, was elevated in the LIFE group with a high VIP value of 2. There is increasing evidence that pipecolic acid is an important regulator of immunity in both plants and humans 66.
Metabolites from the pyrimidine metabolism pathway including uracil, uridine, thymine, and 5-methylcytosine were higher in LIFE versus CON groups, with lower levels of glutamine, cytidine, cytosine, and pseudouridine. Uridine is a uracil nucleoside that is involved in a variety of biological functions including RNA and DNA biosynthesis, glucose and lipid metabolism, glycogen deposition, insulin sensitivity, energy homeostasis, protein and lipid glycosylation, extracellular matrix biosynthesis, and detoxification of xenobiotics 67. Limited human data indicate that plasma uridine levels are inversely related to obesity 68. In mice, uridine supplementation attenuates HFD-induced obesity and NAFLD 69. A high uridine to pseudouridine ratio (as shown in the LIFE group) has been linked to a reduced risk for stroke 70. Uridine decreases oxidative stress and inflammation in vitro and was linked to lower levels of aging indicators in mice 71. Thus, alterations in plasma metabolites related to the pyrimidine pathway may serve as important and novel biomarkers of lifestyle habits and reduced disease risk.
Lifestyle habits had a positive influence on vitamin D3 (cholecalciferol) metabolism with higher plasma calcifediol (25(OH)D3) and calcitriol (1,25(OH)2D3) and lower 1,24,25-trihydroxyvitamin D3 (a 1,25(OH)2D3 catabolism metabolite) in the LIFE versus CON groups. A poor vitamin D status has been linked to obesity and numerous clinical conditions including the metabolic syndrome, type 2 diabetes mellitus, systemic inflammation, autoimmune disorders, and neurodegenerative diseases [70][71][72][73][74][75]. Underlying mechanisms for low vitamin D status in obese populations are unclear but may be related in part to reduced outdoor physical activity and volumetric dilution due to greater volumes of adipose tissue 73.
The N-glycan degradation pathway analysis indicated reduced plasma levels in the LIFE group for mannose, galactose, N-acetylglucosamine, N-acetylneuraminic acid, and fucose. N-glycans (oligosaccharide-protein molecules) are basic components of cell membranes and secreted proteins and help regulate multiple physiological processes. In humans, N-glycosylation involves collections of mannose, galactose, fucose, and sialic acids including N-acetylneuraminic acid and N-acetylglucosamine. Sialic acids are acidic sugars typically located at the terminal positions of glycoproteins 76,77. The amino sugars N-acetyl-d-mannosamine and N-acetyl-d-glucosamine (lower plasma levels in the LIFE group) are essential precursors of sialic acids. Plasma N-glycans and sialic acid levels are rather stable in healthy individuals over time but can be altered due to physiological, pathological, or lifestyle changes 76,77. For example, elevated plasma levels of N-acetylneuraminic acid and N-acetylglucosamine have emerged as potential metabolic markers for inflammation, coronary artery disease progression, and a variety of other diseases 78,79. Elevated plasma mannose has been reported in obese adults and is now considered a biomarker for future risk of several chronic diseases 18,80,81. Increased l-fucose in serum and urine is a potential 82,83. The markedly lower plasma levels of N-glycan degradation metabolites in the LIFE group support the interpretation of reduced chronic disease risk due to positive lifestyle habits. N-acetylglucosamine when polymerized with glucuronic acid forms heparan sulfate and is distributed throughout connective, neural, and epithelial tissues. Lower levels of plasma N-acetylglucosamine support the pathway analysis finding of a lowered degradation of heparan sulfate in the LIFE group 84.
Due to limitations in the curation of metabolites in the library of the Mummichog analysis modules, metabolites including those related to gut microbiome catabolism of food substrates and environmental contaminants were not included in the pathway analysis. Several gut microbiome metabolites reflecting a higher intake of plant-based foods and enhanced gut microbiome alpha diversity were elevated in the LIFE versus CON group including hippuric acid, cinnamoylglycine, cinnamic acids, 3,4-dimethoxyphenylacetic acid, 3-phenylpropanoic acid, and 2-phenylpropionate. An elevated gut microbial metabolite signature in adults with higher lifestyle scores has been reported previously 85. Citric acid cycle metabolites generated from the butanoate metabolism pathway differed between LIFE and CON groups, with higher levels of succinic acid. The butanoate metabolism pathway involves short chain fatty acids (SCFA) produced by bacterial fermentation of undigested carbohydrates (including dietary fiber) and proteins. SCFAs are precursors for numerous metabolites including succinic acid that helps regulate cellular nutrient metabolism and white adipose tissue deposition, muscle fiber remodeling during recovery from exercise, and immune system function 86. Plasma levels of two disaccharides, lactose and sucrose, are indicators of a leaky gut syndrome and were lower in the LIFE versus CON group.
Plasma levels of numerous environmental contaminants were lower in the LIFE versus CON groups including propham (a potato herbicide), fenoxycarb (carbamate-based insecticide metabolite), monocyclohexyl phthalate and (5-Carboxy-2-ethylpentyl)phthalate (plasticizer metabolites), prometon (an herbicide), 3-hydroxycarbofuran (a pesticide carbofuran metabolite), furalaxyl and propamocarb free base (fungicides), and 9-hydroxyfluorene (insecticide and algaecide). Two other cross-sectional studies showed lower levels of blood persistent organochlorine pesticides (POPs) in lean or physically active compared to obese or sedentary adults 87,88. Dietary, lifestyle, and environmental exposures are still being investigated, but some of the environmental contaminants identified in this study tend to accumulate in the fatty tissues of commonly consumed livestock. Thus, a higher intake of red meat fat in the CON group may have increased the body-exposure burden of environmental contaminants 89.
Pathway enrichment identified LIFE versus CON differences in the glycosphingolipid biosynthesis and metabolism pathway, with higher levels of the key metabolite phosphorylcholine. Glycosphingolipids (GSLs) are a specialized class of membrane lipids that support various cellular functions. Phosphorylcholine (PC) is the hydrophilic polar head group of some phospholipids and is a component of the platelet-activating factor and the phospholipids phosphatidylcholine and sphingomyelin 90. Non-pathogenic antibodies against PC are naturally occurring and present in healthy adults. About 5-10% of circulating immunoglobulin M (IgM) consists of IgM anti-PC. IgM anti-PC is negatively associated with several chronic inflammatory conditions, including atherosclerosis, CVD, rheumatic diseases and chronic kidney disease (CKD) 90.
Other metabolites of importance that were elevated in the LIFE group included beneficial fatty acids such as γ-linolenic acid, docosahexaenoic acid (DHA) and eicosapentaenoic acid (EPA). Lower levels of beneficial fatty acids have been reported in obese populations 18. The reduced form of glutathione was significantly elevated in the LIFE group 91. Tryptamine, (2-hydroxyethyl)indole, and serotonin are gut microbial catabolites of tryptophan and were elevated in the LIFE group. These metabolites play roles in the gut-brain axis, immune surveillance, and inflammation regulation 92. Two other gut microbial catabolites of tryptophan were decreased in the LIFE group including indole-3-methyl acetate and indole-3-propionic acid. Plasma betaine and lutein levels were higher in the LIFE group. Betaine is a methyl donor, regulates osmotic pressure, has positive effects on intestinal and kidney health, and exerts anti-inflammatory and anti-oxidative effects 93. Lutein is a common carotenoid in plant foods. Several metabolites related to pain relief medications were elevated in the CON group, with higher levels of acetaminophen in the LIFE group. Plasma nicotine and cotinine were higher in the LIFE group and may indicate a higher prevalence of vaping.

Conclusions
The plasma metabolome reflects the collective influence of multiple lifestyle habits, genotype, clinical stressors, the gut microbiota, and other factors 94. This cross-sectional study investigated differences in the plasma metabolome in two groups of adults that varied widely in body composition and dietary and physical activity patterns.
The results using an extensive untargeted UHPLC-HRMS analysis with more than 10,000 metabolite peaks identified/annotated numerous metabolites and 16 metabolic pathways that differentiated LIFE and CON groups. A novel metabolite signature of positive lifestyle habits emerged from this analysis highlighted by lower plasma levels of numerous bile acids and an amino acid profile consistent with a reduced risk for chronic disease. This analysis also supported an elevated vitamin D status in the LIFE group, higher levels of beneficial fatty acids and gut microbiome catabolism metabolites from plant substrates, and reduced levels of N-glycan degradation metabolites and environmental contaminants. The LOOCV analysis supported the strong effect that body composition had on the plasma metabolome, with moderate effects of MVPA and fruit and vegetable intake. We propose that low-cost anthropometric measurements could be combined with important metabolites from this analysis as precision nutrition indicators of a healthy versus unhealthy lifestyle. These metabolites could include lower plasma levels of glutamic acid, total bile acids, N-acetylneuraminic acid, and mannose, and higher levels of histidine, pipecolic acid, L-glutathione (reduced), succinic acid, γ-linolenic acid, DHA, EPA, hippuric acid, calcitriol, phosphorylcholine, uridine, 5-hydroxylysine, betaine, and lutein.
Pathway enrichment analysis
Pathway enrichment was conducted using all 10,535 peaks in the dataset (with mass-to-charge ratio (m/z), retention time, the p-value, and fold change information as input) for comparison of LIFE versus CON phenotypic groups. The plot of pathway enrichment factor vs. −log10(p) is shown in Fig. 4, and pathways deemed significant (p < 0.05) calculated by MetaboAnalyst are numbered (P1 to P16) in the figure. The top 16 enriched pathways (ranked by significance, p-value < 0.05) are shown in Table 3, and the complete list of the pathways is shown in the Supplementary Table

Figure 2. Receiver-operator-characteristic (ROC) curves for LIFE and CON group discriminators trained on the entire metabolomics dataset. The blue curve is derived from category scores obtained from a single model optimized on all 104 samples, while the red curve depicts category scores obtained for each of 104 samples using 104 separate models optimized on 103 samples (leave-one-out cross-validation, LOO-CV). The area under the curves (AUC) for the single model and LOO models were 1.0 and 0.96, respectively, with p values of 7.7e−19 and 5.6e−16.
Figure 3. Leave-one-out cross validation (LOOCV) predictions using all metabolomics data for selected traits. Older age was strongly related to the metabolomics data with no differences between the LIFE and CON groups (r = 0.80, p-value = 5e−24). LIFE and CON group membership was strongly predicted using the plasma metabolomics data for three different body composition outcomes including BMI (r = 0.84, p-value = 3e−29), percent body fat (r = 0.80, p-value = 7e−24), and the sagittal abdominal diameter (SAD) (r = 0.82, p-value = 6e−27), and moderately predicted for the average number of daily servings of fruits and vegetables combined (r = 0.66, p-value = 3e−14), and the days per week for moderate-to-vigorous physical activity (MVPA) (r = 0.68, p-value = 4e−15).

Table 2. Library matched metabolite peaks that most significantly differentiated LIFE (n = 52) and CON (n = 52) groups (VIP ≥ 1 and p < 0.05 for group fold differences ≥ │1.8│). The table is sorted by Variable Influence on Projection (VIP). See Supplementary Table

Table 3. The top 16 pathways (p ≤ 0.05) enriched and ranked in the Mummichog pathway analysis in MetaboAnalyst. a Pathway total indicates the overall number of metabolites that are included in a specific pathway. b Hits.total indicates the number of measured signals that are matched (m/z error < 3 ppm) with the metabolites included in the pathway. c Hits.sig indicates the number of matched signals that were significantly changed between phenotypic groups. d FET is the right-tail p-value determined by the Fisher Exact Test for pathway enrichment.
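The right-tail Fisher exact test named in footnote d of Table 3 can be written as a hypergeometric tail sum. The following Python sketch is illustrative only: the function name and the counts in the example are hypothetical, and it does not reproduce the permutation-based null distribution that Mummichog uses to assess pathway significance.

```python
from math import comb


def fisher_right_tail(total, total_sig, hits_total, hits_sig):
    """Right-tail p-value: probability of at least hits_sig significant hits in a pathway.

    total      -- all matched signals considered
    total_sig  -- matched signals that are significant overall
    hits_total -- matched signals that fall in the pathway
    hits_sig   -- significant matched signals that fall in the pathway
    """
    upper = min(hits_total, total_sig)
    denom = comb(total, hits_total)
    # math.comb returns 0 when k > n, so impossible tables contribute nothing
    tail = sum(
        comb(total_sig, i) * comb(total - total_sig, hits_total - i)
        for i in range(hits_sig, upper + 1)
    )
    return tail / denom


# Hypothetical counts (not from Table 3): 10 matched signals, 5 significant overall,
# a pathway with 4 hits of which 3 are significant.
p_value = fisher_right_tail(total=10, total_sig=5, hits_total=4, hits_sig=3)
```

A small right-tail p-value means the pathway contains more significant signals than would be expected if significant signals were spread at random across all matched signals.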